

\subsection{Default values design}

Java first initializes fields with either $0$, \texttt{false}, or \texttt{null}
(depending on the field type) and then runs the constructor to initialize
the fields according to the programmer's wishes.  If every X10 type had a
default value that was statically known, then Java's object initialization
scheme could be used in X10.  This would have the advantage of familiarity for
Java programmers that are learning X10.  The disadvantages are that
observing final fields changing value is nonintuitive, and that reading the
field before initialization is prone to undetectable errors.

Unfortunately, it is hard to reconcile the notion of a default value with X10's
type system, because a programmer can define a type which does not contain a
default value.  In the X10 type system, one can define a type with no values at
all, by using a constraint that yields a contradiction.

This could be addressed by extending the X10 types to require that the
programmer define a new constant value for any type that has been constrained
enough that the original default value is no longer a member of the type.  This
means that every field can be initialized to the value defined in its type.  The
disadvantage of this approach is that the type system becomes more complex and more type
annotations are required.  We decided that this, in combination with the
disadvantages given above, was too problematic to justify the advantages of
Java-style object initialization.

For example, consider a field \code{birthday} that cannot be \code{null}, i.e., it does not have a default value:
\begin{lstlisting}
class Person { val birthday:Date{self!=null}; ...}
\end{lstlisting}
Then in this design the programmer would have to specify a default value for this field,
    e.g., \code{Date.MIN\_DATE}.
The field would be initialized with this default value, and its final value would be assigned only later.
Specifying default values seems like a cumbersome design,
    and it would not have the important property that a final field
        has a single value and it can only be read after it was assigned.
%However, in most cases this default value is not even used because
%    a final field should be read only after it was written.
%Therefore, forcing a programmer to assign default values seemed like a problematic design.

\subsection{Proto Design}

If we want to allow some of the programs that the hardhat design rejects, such
as immutable cycles in the object graph, but we do not want to burden the type
system with default value annotations, then one solution is to allow
\texttt{this} to escape in certain cases while still preventing reads from
uninitialized fields.  This can be achieved by annotating reference types with
a keyword \texttt{proto} to indicate that the referenced object is partially
constructed.  Reads of fields where the target object is \texttt{proto}
are not allowed because a partially constructed object may not have
initialized its fields yet.  The advantage of this approach is that it allows a set
of partially constructed objects to establish themselves as a mutually
referential cycle of objects in the heap, which would not otherwise be possible.
The disadvantage is that it requires an additional type annotation, although this
annotation is only required in code that creates immutable cyclic heap
structures.  Also note that there are no additional space or runtime overheads
since these extra type system mechanisms are for static checking only.

An example of an immutable cycle of two nodes is given in
Fig.~\ref{Figure:Cyclic}.  A more practical but less concise example would be an
immutable doubly-linked list.  Let us assume that we would like to optimize
away any null reference checks, so we constrain all references to exclude the
null value.  The commented out lines indicate code that would be rejected by
the type system.

\begin{figure}
\begin{lstlisting}
class C {
  val next : C {self!=null};
  var fld : C;
  def this(n : proto C{self!=null}) {
    //Console.OUT.println(n.next); //err1
    //n.f(); //err2
    this.next = n;
  }
  def this() {
    //Console.OUT.println(this.next); //err3
    //this.f(); //err4
    val c = new C(this);
    //Console.OUT.println(c.next.next); //err5
    this.next = c;
  }
  def f() {
    Console.OUT.println(this.next);
  }
  def this(c : C, Int) {
    //c.m(this); //err6
    Console.OUT.println(c.fld.next);
    this.next = c;
  }
  void m (n : proto C) proto {
    this.fld = n;
  }
  static def test() {
    val c:C{self!=null} = new C();
    assert c.next.next==c;
  }
}
\end{lstlisting}
\caption{An immutable cycle of heap references, using \texttt{proto}.}
\label{Figure:Cyclic}
\end{figure}

In all constructors, \texttt{this} is a reference to a partially
    constructed object.
If the type of \texttt{this} were to be explicit, it would
be \texttt{proto C\{self!=null\}}.  The \texttt{proto} element of the type
forbids any field reads.  It also prevents the reference being leaked (e.g.
into \texttt{f()}), except into variables of \texttt{proto} type where it
follows that there is protection from uninitialized field reads.

The first constructor's \texttt{n} parameter takes a \texttt{proto} reference
to the original \texttt{C} instance.  It is limited in what it can do with
\texttt{n}, e.g. it cannot read \texttt{n.next}, but it can initialize its own
\texttt{next} field with the passed-in value.
When the second constructor returns, both objects are fully constructed with
all fields initialized.  Thus, the type of the variable \texttt{c} does not
have a proto annotation and the field read \texttt{c.next} is allowed.

If a type has the \texttt{proto} keyword, then its fields (both var and val)
may have partially constructed objects assigned to them, but fields may not be
read.  Conversely, the absence of \texttt{proto} means that the fields may be
read but var fields may not have partially constructed objects assigned to them.
This means that \texttt{proto C} and \texttt{C} are not related by subtyping.
In other words, \texttt{proto C} means definitely partially constructed and
\texttt{C} means definitely fully constructed.  Consequently it makes no sense
to allow casting between the two types, and one may not extend or implement a
proto type.  The only sources of \texttt{proto} typed objects are via the
\texttt{this} keyword in a constructor and via method parameters of
\texttt{proto} type.  The only way a type can lose its \texttt{proto} is by
becoming fully constructed.

Consider \code{\itshape err5} in Fig.~\ref{Figure:Cyclic}.
If we had inferred the type of \code{c} to be non-proto,
    then we could have read the uninitialized field \code{c.next.next} (because \code{c.next} points to \code{this}).
To solve this problem, we must ensure that the whole cycle becomes fully
constructed together.
This can be arranged by changing the type of \texttt{new
C(...)} to be \texttt{proto C} if one of its arguments is of \texttt{proto}
type.  This does not affect the assignment \texttt{this.next = c} because
\texttt{this} is \texttt{proto}.

We do not allow fields to have \texttt{proto} type.  This is because the
referenced object will eventually be fully constructed and then there would be
a variable of \texttt{proto} type pointing to a fully constructed instance.
This admits the possibility of someone assigning a partially constructed object
to a field of the fully constructed object, just as was done in the first
constructor in Fig.~\ref{Figure:Cyclic}.  Then, one could accidentally read an
uninitialized field from the partially constructed object by going through the
fully constructed object.  Disallowing \texttt{proto} in fields avoids this
problem.  However local variables are safe because of lexical scoping, since they
will go out of scope before the constructor returns and the object becomes
fully constructed.

%If one examines the state of the heap as these examples execute, there can be
%seen to be a subgraph of 2 partially-constructed objects, which is completely
%isolated except for references from the stack of the thread which is executing
%the constructors.  The type rules described hitherto ensure the heap has the
%property that partially constructed objects are isolated from the rest of the
%heap until their construction is completed.
%
%We are not aware of any utility in throwing or catching \texttt{proto} types so
%we avoid issues relating to partially constructed objects escaping via the
%exception mechanism, by simply forbidding the throwing and catching of
%\texttt{proto} exceptions.

There would be an issue calling other instance methods on \texttt{this} from a
constructor, because the type of \texttt{this} in those methods would need to
be \texttt{proto} since the target is still partially constructed.  We support
this by allowing the \texttt{proto} keyword to also be used on a method as an
effect annotation, i.e. it must be preserved by inheritance.  Such methods are
called \texttt{proto} methods and can be called on partially constructed
targets.  The type of \texttt{this} then subjects the body of the method to the
same restrictions as we have already seen in constructor bodies.

However in some cases we would like to avoid code duplication by allowing some
methods to be callable on both \texttt{proto} and non-\texttt{proto} targets.
This violates our principle that the two kinds of objects enjoy different
privileges and are completely distinct.
The error \code{\itshape err6} in Fig.~\ref{Figure:Cyclic}
shows how we could potentially read an uninitialized field if we allowed this
relaxation.

To address this, we only allow the method to be called on non-\texttt{proto}
targets if there are no \texttt{proto} parameters to the method.  No such
parameters means the only partially constructed object in scope is
\texttt{this}.  In the case where the method is called on a non-\texttt{proto}
target there is therefore no partially constructed object in scope, and no harm
can be done.

While we believe this type system is correct and usable for writing real
programs in the X10 language, we had to decide whether the additional type
system complexity and annotations were a reasonable price to pay for the
additional expressiveness (i.e. the ability to construct immutable heap
cycles).  We ultimately decided that immutable heap cycles are too rare in
practice to justify including these extra mechanisms in the language.
